
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import warnings
warnings.filterwarnings('ignore')
df_a=pd.read_excel('Athletes.xlsx')
a=df_a[['Name','Discipline']].groupby(df_a['Name']).agg('count')
a=a[a['Name']==2]
a.drop(columns=['Discipline'],axis=1,inplace=True)
a.rename(columns={"Name":'No. of Discipline'})
| Name | No. of Discipline |
|---|---|
| ALI Mohamed | 2 |
| ALVAREZ Jorge | 2 |
| CHEN Yang | 2 |
| DYGERT Chloe | 2 |
| GANNA Filippo | 2 |
| HALL James | 2 |
| HAVIK Yoeri | 2 |
| KIM Hyunsoo | 2 |
| KOPECKY Lotte | 2 |
| KOVACS Zsofia | 2 |
| KURBANOV Ruslan | 2 |
| LI Qian | 2 |
| MARTIN Daniel | 2 |
| PALTRINIERI Gregorio | 2 |
| PEREZ Maria | 2 |
| PEREZ Paola | 2 |
| PORTELA Teresa | 2 |
| SUN Jiajun | 2 |
| WANG Yang | 2 |
| WATANABE Yuta | 2 |
| WELLBROCK Florian | 2 |
| ZHANG Xin | 2 |
| van ROUWENDAAL Sharon | 2 |
There are 23 athletes, from different countries, who are each entered in two different disciplines.
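The count above tallies rows per name, so it assumes that each name belongs to one person and that each row is a different discipline. A minimal sketch of a stricter check follows, counting distinct disciplines per name and country. The `sample` frame is made up for illustration and only mirrors the column names of `Athletes.xlsx`.

```python
import pandas as pd

# Hypothetical sample mirroring the Athletes.xlsx columns (not real data)
sample = pd.DataFrame({
    'Name': ['A One', 'A One', 'B Two', 'C Three', 'C Three'],
    'NOC': ['X', 'X', 'Y', 'Z', 'W'],
    'Discipline': ['Swimming', 'Marathon Swimming', 'Judo', 'Rowing', 'Rowing'],
})

# Count distinct disciplines per athlete, keyed on name AND country so that
# two different people who happen to share a name are not merged
n_disc = sample.groupby(['Name', 'NOC'])['Discipline'].nunique()

# Keep only athletes entered in two or more distinct disciplines
multi = n_disc[n_disc >= 2].rename('No. of Discipline').reset_index()
print(multi)
```

Here `C Three` appears twice but under two countries and in the same discipline, so it is not counted, whereas a plain row count per name would include it.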
px.histogram(df_a,y='NOC',color='Discipline',height=3500,title='Country with Athletes on different Discipline')
From the above histogram, it is clear that the USA, Japan and so on have the most athletes at the Olympics, and that their athletes compete in many different disciplines.
a=df_a['NOC'].value_counts()
df_a1=pd.DataFrame({'NOC':a.keys(),'Player':a.values})
df_a1.loc[df_a1['Player']<=200,'NOC']='Other countries'
px.pie(df_a1,values='Player',names='NOC',title='Player by Country')
From the above pie chart, it is clear that most of the athletes are from the USA, Japan, Australia and so on.
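The relabelling step above leaves many separate rows that all carry the name 'Other countries'. A small sketch of collapsing them into a single row before plotting is shown below, so the grouped total can be inspected directly. The counts are invented for illustration; only the column names and the threshold of 200 come from the code above.

```python
import pandas as pd

# Hypothetical athlete counts per country (illustrative numbers only)
df = pd.DataFrame({
    'NOC': ['A', 'B', 'C', 'D', 'E'],
    'Player': [600, 550, 150, 90, 10],
})

threshold = 200  # same cut-off as used above

# Relabel every country at or below the threshold
df.loc[df['Player'] <= threshold, 'NOC'] = 'Other countries'

# Sum the relabelled rows into one row per label, largest first
grouped = (
    df.groupby('NOC', as_index=False)['Player']
      .sum()
      .sort_values('Player', ascending=False)
)
print(grouped)
```

The resulting frame can be passed to `px.pie` in the same way as `df_a1`.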
a=df_a['Discipline'].value_counts()
px.histogram(y=a.keys(),x=a.values,height=1000,title='Players participate in different Discipline')
From the above visualization, it is clear that the largest numbers of athletes participate in Athletics, Swimming, Football, Rowing and so on.
df_t=pd.read_excel('Teams.xlsx')
px.bar(df_t,x='NOC',y='Discipline',color='Event',width=1200)
From the above bar chart, it is clear that the USA has the highest number of teams.
px.bar(df_t,x='Discipline',y='NOC',color='Event')
From the above bar chart, it is clear that Swimming, Athletics, Archery and so on have the highest number of teams at the 2020 Olympics.
df_g=pd.read_excel('EntriesGender.xlsx')
df_g.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 46 entries, 0 to 45
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   Discipline  46 non-null     object
 1   Female      46 non-null     int64
 2   Male        46 non-null     int64
 3   Total       46 non-null     int64
dtypes: int64(3), object(1)
memory usage: 1.6+ KB
px.histogram(df_g,x="Discipline",y=['Male','Female'])
From the above histogram, it is clear that the most male athletes participate in Athletics, Swimming, Football and so on, whereas the most female athletes participate in Athletics, Swimming, Football, Rowing and so on.
Overall, Athletics has the highest number of athletes.
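To compare male and female participation beyond reading the bars, the share of female entries per discipline can be computed directly. The sketch below uses invented numbers; only the column names (`Discipline`, `Female`, `Male`, `Total`) are taken from the `df_g.info()` output above, and the `Female share (%)` column is an addition for illustration.

```python
import pandas as pd

# Hypothetical rows mirroring the EntriesGender.xlsx columns (not real data)
g = pd.DataFrame({
    'Discipline': ['D1', 'D2', 'D3'],
    'Female': [50, 30, 100],
    'Male': [50, 90, 40],
})
g['Total'] = g['Female'] + g['Male']

# Percentage of entries in each discipline that are female
g['Female share (%)'] = (100 * g['Female'] / g['Total']).round(1)

# Largest disciplines first
top = g.sort_values('Total', ascending=False)
print(top)
```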
df_c=pd.read_excel('Coaches.xlsx')
df_c.head()
| | Name | NOC | Discipline | Event |
|---|---|---|---|---|
| 0 | ABDELMAGID Wael | Egypt | Football | NaN |
| 1 | ABE Junya | Japan | Volleyball | NaN |
| 2 | ABE Katsuhiko | Japan | Basketball | NaN |
| 3 | ADAMA Cherif | Côte d'Ivoire | Football | NaN |
| 4 | AGEBA Yuya | Japan | Volleyball | NaN |
px.histogram(df_c,y='NOC',color='Discipline',height=1500)
From the above histogram, Japan has the highest number of coaches at the 2020 Olympics, the USA and Spain have the second highest number, and Australia has the third highest.
Top 50 Highest Number of Coaches
df_m=pd.read_excel('Medals.xlsx')
df_m.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 93 entries, 0 to 92
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Rank           93 non-null     int64
 1   Team/NOC       93 non-null     object
 2   Gold           93 non-null     int64
 3   Silver         93 non-null     int64
 4   Bronze         93 non-null     int64
 5   Total          93 non-null     int64
 6   Rank by Total  93 non-null     int64
dtypes: int64(6), object(1)
memory usage: 5.2+ KB
px.bar(df_m,y='Team/NOC',x=['Gold','Silver','Bronze'],height=1500)
From the above bar chart:
Gold: the USA won the most, followed by China and then Japan.
Silver: the USA won the most, followed by China and then Russia.
Bronze: the USA won the most, followed by Russia, with Great Britain and Australia tied for third.
df_m1=df_m.copy()
df_m1.loc[df_m['Total']<=15,'Team/NOC']='Other countries'
px.pie(df_m1, values='Total', names='Team/NOC', title='Medal won by Country')
From the above pie chart, it is clear that USA athletes won the most medals, followed by China in second place and Russia in third, and so on.
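Since the pie chart relies on the `Total` column, it may be worth checking that each total equals the sum of the three medal columns. A minimal sketch of such a check follows. The rows are made up, with one deliberately inconsistent total; only the column names come from the `df_m.info()` output above.

```python
import pandas as pd

# Hypothetical medal rows; the last Total is deliberately wrong
medals = pd.DataFrame({
    'Team/NOC': ['T1', 'T2', 'T3'],
    'Gold': [3, 2, 1],
    'Silver': [1, 2, 0],
    'Bronze': [0, 1, 4],
    'Total': [4, 5, 6],
})

# Recompute the total from the individual medal columns
expected = medals[['Gold', 'Silver', 'Bronze']].sum(axis=1)

# Rows where the stored total disagrees with the recomputed one
mismatch = medals[medals['Total'] != expected]
print(mismatch)
```

An empty `mismatch` frame on the real data would indicate that the totals are internally consistent.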
df_combine_a=pd.DataFrame({'NOC':df_a.NOC.value_counts().keys(),'No_of_Athletes':df_a.NOC.value_counts().values})
df_combine_c=pd.DataFrame({'NOC':df_c.NOC.value_counts().keys(),'No_of_Coaches':df_c.NOC.value_counts().values})
df_combine_m=pd.DataFrame({'NOC':df_m['Team/NOC'],'Medals':df_m['Total'],'Gold':df_m['Gold'],'Silver':df_m['Silver'],'Bronze':df_m['Bronze']})
df_combine=pd.merge(left=df_combine_a,right=df_combine_c,how='outer',on='NOC')
df_combine=pd.merge(left=df_combine,right=df_combine_m,how='outer',on='NOC')
px.histogram(df_combine[:61],y='NOC',x=['No_of_Athletes','No_of_Coaches','Medals'],barmode='group',height=1500)
From the above histogram,
USA
The USA took first position by winning 113 medals: 39 gold, 41 silver and 33 bronze.
It sent the highest number of athletes.
It has 28 coaches, the second highest number.
China
China took second position by winning 88 medals: 38 gold, 32 silver and 18 bronze.
It sent the fourth highest number of athletes.
It has 12 coaches across disciplines, the sixth highest number.
Japan
Japan took third position by winning 58 medals: 27 gold, 14 silver and 17 bronze.
It sent the second highest number of athletes.
It has 35 coaches across disciplines, the highest number.
Great Britain
Great Britain took fourth position by winning 65 medals: 22 gold, 21 silver and 22 bronze.
It sent the eighth highest number of athletes.
It has only 7 coaches across disciplines.
Russia
Russia took fifth position by winning 71 medals: 20 gold, 28 silver and 23 bronze.
It sent the eleventh highest number of athletes.
It has 12 coaches across disciplines, the sixth highest number.
Australia
Australia took sixth position by winning 46 medals: 17 gold, 7 silver and 22 bronze.
It sent the third highest number of athletes.
It has 22 coaches across disciplines, the third highest number.
Netherlands
The Netherlands took seventh position by winning 36 medals: 10 gold, 12 silver and 14 bronze.
It sent the thirteenth highest number of athletes.
It has 10 coaches across disciplines.
France
France took eighth position by winning 33 medals: 10 gold, 12 silver and 11 bronze.
It sent the sixth highest number of athletes.
It has only 10 coaches across disciplines.
Germany
Germany took ninth position by winning 37 medals: 10 gold, 11 silver and 16 bronze.
It sent the fifth highest number of athletes.
It has only 9 coaches across disciplines.
Italy
Italy took tenth position by winning 40 medals: 10 gold, 10 silver and 20 bronze.
It sent the ninth highest number of athletes.
It has 16 coaches across disciplines, the fourth highest number.
Canada
Canada took eleventh position by winning 24 medals: 7 gold, 6 silver and 11 bronze.
It sent the seventh highest number of athletes.
It has 16 coaches across disciplines, the fourth highest number.
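The positions quoted in the summaries above (for example "fourth highest number of athletes") can be computed rather than read off the chart. The sketch below shows one way to do this with `rank` on a merged frame. The three small input frames are hypothetical stand-ins for `df_combine_a`, `df_combine_c` and `df_combine_m`, and the `*_rank` columns are names introduced here for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the three per-country frames (not real data)
athletes = pd.DataFrame({'NOC': ['A', 'B', 'C'], 'No_of_Athletes': [300, 500, 100]})
coaches = pd.DataFrame({'NOC': ['A', 'B'], 'No_of_Coaches': [10, 5]})
medals = pd.DataFrame({'NOC': ['A', 'B', 'C'], 'Gold': [5, 9, 1], 'Medals': [20, 15, 3]})

# Outer merges keep countries that are missing from one of the frames
combined = (
    athletes.merge(coaches, how='outer', on='NOC')
            .merge(medals, how='outer', on='NOC')
)

# A country absent from the coaches frame is treated as having 0 coaches
combined['No_of_Coaches'] = combined['No_of_Coaches'].fillna(0).astype(int)

# Rank 1 = highest value; ties share the same (minimum) rank
for col in ['Gold', 'No_of_Athletes', 'No_of_Coaches']:
    combined[f'{col}_rank'] = combined[col].rank(method='min', ascending=False).astype(int)

# Order by gold medals so the frame reads like the summaries above
summary = combined.sort_values('Gold', ascending=False).reset_index(drop=True)
print(summary)
```

Using `method='min'` means tied countries share a position, which matches statements such as two countries both having the fourth highest number of coaches.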
